Lesson: Parsing Data for Data Cleanse 


For example, with the input data expat@london.home.office.city.co.uk, Data Cleanse 
outputs each element in the following fields: 


Table 27: Data Cleanse Output Example 
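
The kind of split described above can be sketched in a few lines of Python. This is only an illustration, not how Data Cleanse is implemented: the function name split_email and the dictionary keys (user, domain_all, host, domain_levels) are hypothetical and are not the product's actual output field names.

```python
def split_email(address: str) -> dict:
    """Split an email address into user, full domain, host, and domain levels.

    All key names are hypothetical labels chosen for this sketch.
    """
    user, sep, domain = address.strip().rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"not a parseable email address: {address!r}")
    labels = domain.split(".")
    return {
        "user": user,
        "domain_all": domain,
        # Leftmost label of the domain.
        "host": labels[0],
        # Remaining labels, counted from the right (top-level domain first).
        "domain_levels": labels[:0:-1],
    }


parts = split_email("expat@london.home.office.city.co.uk")
for key, value in parts.items():
    print(f"{key}: {value}")
```

Counting the domain levels from the right keeps the top-level domain in the same position regardless of how many subdomains the address contains.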





Phone number data 


Data Cleanse can parse both North American (US and Canada) and international phone 
numbers. When Data Cleanse parses a phone number, it outputs the individual components 
of the number into the appropriate fields. 


Data Cleanse recognizes phone numbers by their pattern and (for non-US numbers) by their 
country code. For North American phone numbers, it looks for commonly used patterns such 
as (234) 567-8901, 234-567-8901, and 2345678901. On output, it offers some reformatting 
options, such as your choice of delimiters. 
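
Pattern-based recognition of the three North American formats listed above, followed by output with a chosen delimiter, might look roughly like the Python sketch below. The regular expression, the function names parse_nanp and format_nanp, and the component names (area_code, exchange, line) are assumptions made for illustration; they do not reflect the actual Data Cleanse parsing rules or output fields.

```python
import re

# Accepts (234) 567-8901, 234-567-8901, and 2345678901 (hypothetical pattern).
NANP_PATTERN = re.compile(
    r"^(?:\((?P<paren_area>\d{3})\)\s?|(?P<area>\d{3})[-.\s]?)"
    r"(?P<exchange>\d{3})[-.\s]?"
    r"(?P<line>\d{4})$"
)


def parse_nanp(text):
    """Return the components of a North American number, or None if no match."""
    match = NANP_PATTERN.match(text.strip())
    if match is None:
        return None
    return {
        "area_code": match.group("paren_area") or match.group("area"),
        "exchange": match.group("exchange"),
        "line": match.group("line"),
    }


def format_nanp(parts, delimiter="-"):
    """Join the parsed components using the caller's choice of delimiter."""
    return delimiter.join([parts["area_code"], parts["exchange"], parts["line"]])


for raw in ["(234) 567-8901", "234-567-8901", "2345678901"]:
    print(raw, "->", format_nanp(parse_nanp(raw), delimiter="."))
```

Separating parsing from formatting means the same parsed components can be written out with any delimiter without re-reading the input.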


Data Cleanse searches for European and Pacific-Rim numbers by pattern. The patterns used 
are defined from the US and require that the country code appear at the beginning of the 
number. Note that Data Cleanse does not offer any options for reformatting international 
phone numbers, nor does it cross-compare the phone number with the address to check 
whether the country and city codes in the phone number match the address. 
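
The requirement that the country code appear at the beginning of the number can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the leading "+" convention, the tiny COUNTRY_CODES table, and the function name parse_international are hypothetical and do not represent the patterns that Data Cleanse actually ships with.

```python
import re

# A deliberately small, hypothetical lookup table of country calling codes.
COUNTRY_CODES = {
    "44": "United Kingdom",
    "49": "Germany",
    "61": "Australia",
    "81": "Japan",
}


def parse_international(text):
    """Split a number into country code and remainder, or return None.

    The number is only recognized when it starts with "+" followed by a
    known country code; no reformatting of the remainder is attempted.
    """
    stripped = text.strip()
    if not stripped.startswith("+"):
        return None
    digits = re.sub(r"\D", "", stripped)
    # Country calling codes are one to three digits long; try longest first.
    for length in (3, 2, 1):
        code = digits[:length]
        if code in COUNTRY_CODES:
            return {
                "country_code": code,
                "country": COUNTRY_CODES[code],
                "national_number": digits[length:],
            }
    return None


print(parse_international("+44 20 7946 0958"))
print(parse_international("020 7946 0958"))  # no leading country code
```

The second call returns None because the number is written in national format, without a country code at the start.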


Date data 


Data Cleanse can parse up to six dates from your defined record. Data Cleanse identifies the 
dates in the input, breaks dates into day/month/year components, and makes dates 
available as output in either the original format, such as DD-MMM-YY, or a user-selected 
standard format such as MM/DD/YYYY. 
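
As a rough illustration of this behavior, the Python sketch below finds dates in a text value, breaks each one into day, month, and year, and returns both the original text and a standard MM/DD/YYYY rendering. The candidate pattern, the list of accepted input formats, and the function name parse_dates are assumptions for this example only; the limit of six dates mirrors the description above. Month abbreviations are matched using the default English locale.

```python
import re
from datetime import datetime

MAX_DATES = 6  # limit taken from the description above

# Hypothetical candidate pattern: 15-Mar-21, 15-Mar-2021, 04/01/2021, 04/01/21.
DATE_CANDIDATE = re.compile(
    r"\b\d{1,2}[-/](?:\d{1,2}|[A-Za-z]{3})[-/]\d{2,4}\b"
)
# Hypothetical set of accepted input formats, tried in order.
INPUT_FORMATS = ("%d-%b-%y", "%d-%b-%Y", "%m/%d/%Y", "%m/%d/%y")


def parse_dates(text, output_format="%m/%d/%Y"):
    """Return up to MAX_DATES parsed dates found in text."""
    results = []
    for match in DATE_CANDIDATE.finditer(text):
        if len(results) == MAX_DATES:
            break
        raw = match.group(0)
        for fmt in INPUT_FORMATS:
            try:
                parsed = datetime.strptime(raw, fmt)
            except ValueError:
                continue  # try the next accepted format
            results.append({
                "original": raw,
                "day": parsed.day,
                "month": parsed.month,
                "year": parsed.year,
                "standard": parsed.strftime(output_format),
            })
            break
    return results


for date in parse_dates("Hired 15-Mar-21, reviewed 04/01/2021."):
    print(date["original"], "->", date["standard"])
```

Keeping the original text alongside the standardized value lets the caller choose either form on output.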


International data 


By default, Data Cleanse can identify international data presented in multiple formats. There 
are also several ways that you can use Data Cleanse to identify and manipulate various forms 
of other international data, including prenames, greetings, and personal identification 
numbers. 


• Customizing greetings and prenames per country: The default prenames and salutations 
found in the Data Cleanse greetings option group are commonly used in English-speaking 
nations. For countries where English is not the primary language, you can modify these 
options to reflect common prenames and salutations. 